Supplementary Material for UltraMedical
Beyond model-based scoring, previous studies have also attempted to rank instructions directly by length. As illustrated on the right side of Figure 2, the correlation between model-based scores and instruction lengths is very low, indicating that the evaluator assesses instruction complexity rather than merely instruction length.
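The score-versus-length comparison above can be sketched with a plain Pearson correlation. The data here is hypothetical (lengths and scores drawn independently, so the correlation should be near zero); the function itself is the standard estimator.

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: instruction lengths and evaluator scores sampled
# independently, mimicking the "very low correlation" observation.
random.seed(0)
lengths = [random.randint(20, 400) for _ in range(1000)]
scores = [random.uniform(1, 10) for _ in range(1000)]
r = pearson(lengths, scores)
print(f"Pearson r = {r:.3f}")
```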
One-to-Multiple: A Progressive Style Transfer Unsupervised Domain-Adaptive Framework for Kidney Tumor Segmentation
In multi-sequence Magnetic Resonance Imaging (MRI), the accurate segmentation of the kidney and tumor based on traditional supervised methods typically necessitates detailed annotation for each sequence, which is both time-consuming and labor-intensive. Unsupervised Domain Adaptation (UDA) methods can effectively mitigate inter-domain differences by aligning cross-modal features, thereby reducing the annotation burden. However, most existing UDA methods are limited to one-to-one domain adaptation, which tends to be inefficient and resource-intensive when faced with multi-target domain transfer tasks. To address this challenge, we propose a novel and efficient One-to-Multiple Progressive Style Transfer Unsupervised Domain-Adaptive (PSTUDA) framework for kidney and tumor segmentation in multi-sequence MRI. Specifically, we develop a multi-level style dictionary to explicitly store the style information of each target domain at various stages, which alleviates the burden of a single generator in a multi-target transfer task and enables effective decoupling of content and style. Concurrently, we employ multiple cascading style fusion modules that utilize point-wise instance normalization to progressively recombine content and style features, which enhances cross-modal alignment and structural consistency. Experiments conducted on the private MSKT and public KiTS19 datasets demonstrate the superiority of the proposed PSTUDA over comparative methods in multi-sequence kidney and tumor segmentation. The average Dice Similarity Coefficients are increased by at least 1.8% and 3.9% on the two datasets, respectively. Impressively, our PSTUDA not only significantly reduces the floating-point computation by approximately 72% but also reduces the number of model parameters by about 50%, bringing higher efficiency and feasibility to practical clinical applications.
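The core idea of recombining content features with stored target-domain style statistics can be illustrated with an adaptive-instance-normalization-style sketch. This is a minimal NumPy illustration, not PSTUDA's actual module: the style-dictionary keys, tensor shapes, and per-channel (mean, std) representation are all assumptions for the example.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Normalize each channel of one feature map; x has shape (C, H, W)."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    return (x - mu) / (sigma + eps)

def style_fusion(content, style_mean, style_std):
    """Recombine content features with stored target-domain style
    statistics, in the spirit of adaptive instance normalization."""
    return style_std * instance_norm(content) + style_mean

# Hypothetical multi-level style dictionary: one (mean, std) pair per
# target MRI sequence and per generator stage.
rng = np.random.default_rng(0)
style_dict = {
    ("T2", 0): (rng.normal(size=(8, 1, 1)), rng.uniform(0.5, 2.0, size=(8, 1, 1))),
}
content = rng.normal(loc=3.0, scale=5.0, size=(8, 32, 32))
mean, std = style_dict[("T2", 0)]
out = style_fusion(content, mean, std)
# After fusion, each channel's statistics match the stored style entry.
print(np.allclose(out.mean(axis=(1, 2), keepdims=True), mean, atol=1e-3))
```

Storing such statistics per target domain and per stage is what lets one generator serve many target sequences: only the lightweight dictionary entries differ across domains.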
AllSim: Simulating and Benchmarking Resource Allocation Policies in Multi-User Systems
Numerous real-world systems, ranging from healthcare to energy grids, involve users competing for finite and potentially scarce resources. Designing policies for repeated resource allocation in such real-world systems is challenging for many reasons, including the changing nature of user types and their (possibly urgent) need for resources. Researchers have developed numerous machine learning solutions for determining repeated resource allocation policies in these challenging settings. However, a key limitation has been the absence of good methods and test-beds for benchmarking these policies; almost all resource allocation policies are benchmarked in environments which are either completely synthetic or do not allow any deviation from historical data. In this paper we introduce AllSim, a benchmarking environment for realistically simulating the impact and utility of resource allocation policies in systems in which users compete for such scarce resources. Building such a benchmarking environment is challenging because it needs to take into account the entire collective of potential users and the impact a resource allocation policy has on all the other users in the system. AllSim's benchmarking environment is modular (each component being parameterized individually), learnable (informed by historical data), and customizable (adaptable to changing conditions). These components, when interacting with an allocation policy, produce a dataset of simulated outcomes for evaluating and comparing such policies. We believe AllSim is an essential step towards a more systematic evaluation of policies for scarce resource allocation compared to current approaches for benchmarking such methods.
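The repeated-allocation setting described above can be sketched as a simple simulation loop. This is an illustrative toy, not AllSim's API: the `User` fields, the greedy policy, and the deterioration dynamics are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

@dataclass
class User:
    urgency: float  # higher = more urgent need for the resource

def greedy_policy(waiting, n_resources):
    """Allocate the scarce resources to the most urgent waiting users."""
    ranked = sorted(waiting, key=lambda u: u.urgency, reverse=True)
    return ranked[:n_resources]

def simulate(policy, steps=50, arrivals=4, resources=2, seed=0):
    """Toy repeated-allocation loop: users arrive each step, the policy
    chooses who receives a resource, and unserved users deteriorate."""
    rng = random.Random(seed)
    waiting, served, lost = [], 0, 0
    for _ in range(steps):
        waiting += [User(urgency=rng.random()) for _ in range(arrivals)]
        for u in policy(waiting, resources):
            waiting.remove(u)
            served += 1
        # Unserved users deteriorate; the most urgent may drop out.
        survivors = []
        for u in waiting:
            if u.urgency > 0.95 and rng.random() < 0.5:
                lost += 1
            else:
                u.urgency = min(1.0, u.urgency + 0.05)
                survivors.append(u)
        waiting = survivors
    return {"served": served, "lost": lost, "still_waiting": len(waiting)}

result = simulate(greedy_policy)
print(result)
```

A benchmarking environment in this spirit would replace the hand-coded arrival and deterioration rules with components learned from historical data, while keeping the policy interface fixed so that competing policies can be compared on identical simulated cohorts.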
Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks
Designing optimal treatment plans for patients with comorbidities requires accurate cause-specific mortality prognosis. Motivated by the recent availability of linked electronic health records, we develop a nonparametric Bayesian model for survival analysis with competing risks, which can be used for jointly assessing a patient's risk of multiple (competing) adverse outcomes. The model views a patient's survival times with respect to the competing risks as the outputs of a deep multi-task Gaussian process (DMGP), the inputs to which are the patient's covariates. Unlike parametric survival analysis methods based on Cox and Weibull models, our model uses DMGPs to capture complex non-linear interactions between the patients' covariates and cause-specific survival times, thereby learning flexible patient-specific and cause-specific survival curves, all in a data-driven fashion without explicit parametric assumptions on the hazard rates. We propose a variational inference algorithm that is capable of learning the model parameters from time-to-event data while handling right censoring. Experiments on synthetic and real data show that our model outperforms state-of-the-art survival models.
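The competing-risks data format assumed above (an observed time plus an event indicator, where 0 marks right censoring and k >= 1 marks failure from cause k) pairs naturally with the standard nonparametric Aalen-Johansen cumulative incidence estimator, which such model-based survival curves are typically compared against. A minimal sketch, with a hypothetical synthetic cohort; this is the classical estimator, not the paper's DMGP model.

```python
import numpy as np

def cumulative_incidence(times, events, cause, eval_t):
    """Aalen-Johansen cumulative incidence for one competing risk.
    events: 0 = right-censored, k >= 1 = failure from cause k."""
    order = np.argsort(times)
    times, events = times[order], events[order]
    n = len(times)
    surv = 1.0       # overall survival just before the current time
    cif = 0.0
    at_risk = n
    for i in range(n):
        if times[i] > eval_t:
            break
        if events[i] == cause:
            cif += surv * (1.0 / at_risk)   # increment by S(t-) * dN_k / n
        if events[i] != 0:
            surv *= 1.0 - 1.0 / at_risk     # all-cause Kaplan-Meier update
        at_risk -= 1
    return cif

# Hypothetical toy cohort: two competing causes plus right censoring.
rng = np.random.default_rng(1)
t1 = rng.exponential(5.0, 500)    # latent time to failure from cause 1
t2 = rng.exponential(8.0, 500)    # latent time to failure from cause 2
c = rng.exponential(10.0, 500)    # censoring time
times = np.minimum(np.minimum(t1, t2), c)
events = np.where(c <= np.minimum(t1, t2), 0, np.where(t1 <= t2, 1, 2))
p1 = cumulative_incidence(times, events, cause=1, eval_t=20.0)
p2 = cumulative_incidence(times, events, cause=2, eval_t=20.0)
print(round(p1, 3), round(p2, 3))
```

By construction the cause-specific incidences sum to one minus the all-cause survival, so p1 + p2 never exceeds 1 — the consistency property any competing-risks model must also respect.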
Supplementary Materials for M
A.1 Extraction Process

As mentioned in Section 2, the patient notes used for M were drawn from the following sources.

The motivation of the first track was to challenge participants to retrieve relevant articles that can help answer potential questions for a particular patient note. The 2014 and 2015 patient notes are synthetic notes hand-written by individuals with medical training, but the 2016 dataset consists of real patient summaries coming from electronic health records.

TREC Clinical Trials (125 notes; average note length 137.7): This track consists of 125 patient notes, where 50 notes are from the year 2021 and 75 notes are from the year 2022. This track was meant to have participants retrieve previous clinical trials from ClinicalTrials.gov that best match the symptoms described in the patient note. The notes from both tracks are synthetic notes written by individuals with medical training, meant to simulate an admission statement from an electronic health record (EHR).

MedQA-USMLE (12,893 questions; average note length 135.8): Questions from a multiple-choice professional medical board exam, which can include patient summaries and ask questions about particular issues based on the summary.

We used GPT-3.5-turbo to identify eligible patient notes for each calculator. We did this by shortlisting the notes that had at least one relevant attribute required for a given calculator. We then kept the notes that had all of the numeric parameters needed for each calculator. These notes were also filtered to have enough categorical variables inferred such that at least 50% of the total number of attributes were present for a given patient note. The attribute extractions of these notes were then verified by authors of this paper.

Extraction 1 Prompt: For more details on step 1, we provided a set of 32 parameters which cover at least one attribute needed for each of the 55 calculators. For each note in Open-Patients, we applied the prompt shown below to determine which of the 32 parameters could be extracted from each note.
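The eligibility filter described above (all numeric parameters present, and at least 50% of all required attributes present) can be sketched as a small predicate. The parameter names and the `eligible` helper are hypothetical, introduced only for illustration.

```python
def eligible(extracted, numeric_required, all_required):
    """Decide whether one patient note qualifies for a calculator.
    extracted maps parameter name -> extracted value for the note."""
    # Every numeric parameter the calculator needs must be present.
    if not all(p in extracted for p in numeric_required):
        return False
    # At least half of all required attributes must be present overall.
    found = sum(1 for p in all_required if p in extracted)
    return found / len(all_required) >= 0.5

# Hypothetical calculator needing two numeric and two categorical inputs.
numeric = ["creatinine", "age"]
required = numeric + ["sex", "on_dialysis"]
note_a = {"creatinine": 1.2, "age": 64, "sex": "F"}   # 3/4 attributes
note_b = {"creatinine": 1.2}                          # missing "age"
ok_a = eligible(note_a, numeric, required)
ok_b = eligible(note_b, numeric, required)
print(ok_a, ok_b)
```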